AITopics | letter frequency

letter frequency

ALICE: An Interpretable Neural Architecture for Generalization in Substitution Ciphers

Shen, Jeff, Smith, Lindsay M.

arXiv.org Artificial Intelligence, Sep-26-2025

To enhance interpretability, we introduce a novel bijective decoding head that explicitly models permutations via the Gumbel-Sinkhorn method, enabling direct extraction of learned cipher mappings. Our architectural innovations and analysis methods are applicable beyond cryptograms and offer new insights into neural network generalization and interpretability. A cryptogram is a type of puzzle in which text is encrypted using a substitution cipher, and the user's task is to recover the original plaintext by inferring the cipher used for the encryption. Users typically solve cryptograms using prior knowledge of a language's letter frequency distribution and common words. Originally developed for real encryption, cryptograms are now popular in newspapers and puzzle books as entertainment because of their simplicity. This simplicity, however, makes them a unique testbed for understanding generalization and reasoning in neural networks. In a one-to-one monoalphabetic substitution cipher, each letter in a fixed alphabet is mapped to a unique substitute character, so the cipher is a bijective mapping over the alphabet. While other ciphers exist (e.g., the Vigenère cipher and the Playfair cipher), we focus here on one-to-one monoalphabetic substitution ciphers, as the problem space is extremely large but remains structurally simple to interpret. Hereafter, "cipher" means a one-to-one monoalphabetic substitution cipher unless otherwise specified. More formally, let Σ be a finite alphabet of size V representing allowable characters (e.g., V = 26 for the English alphabet).
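The bijective mapping described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `make_cipher`, `invert`, and `apply_mapping` are hypothetical, and the alphabet is assumed to be the 26 lowercase English letters. Because the mapping is a permutation, it has a unique inverse, and there are 26! (roughly 4 × 10^26) possible ciphers.

```python
import random
import string


def make_cipher(alphabet=string.ascii_lowercase, seed=0):
    """Return a random bijective mapping (a permutation) over the alphabet."""
    rng = random.Random(seed)
    shuffled = list(alphabet)
    rng.shuffle(shuffled)
    return dict(zip(alphabet, shuffled))


def invert(cipher):
    """Invert the mapping; this is well defined only because it is a bijection."""
    return {substitute: letter for letter, substitute in cipher.items()}


def apply_mapping(text, mapping):
    """Substitute every character found in the mapping; leave others unchanged."""
    return "".join(mapping.get(ch, ch) for ch in text)


cipher = make_cipher()
plaintext = "letter frequency helps solve cryptograms"
ciphertext = apply_mapping(plaintext, cipher)
recovered = apply_mapping(ciphertext, invert(cipher))
```

Applying the inverse mapping to the ciphertext returns the original plaintext, which is what makes recovering the mapping equivalent to solving the cryptogram.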

cipher, machine learning, natural language, (19 more...)

2509.07282

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

M-IFEval: Multilingual Instruction-Following Evaluation

Dussolle, Antoine, Díaz, Andrea Cardeña, Sato, Shota, Devine, Peter

arXiv.org Artificial Intelligence, Feb-7-2025

Instruction following is a core capability of modern large language models (LLMs), making evaluating this capability essential to understanding these models. The Instruction Following Evaluation (IFEval) benchmark from the literature does this using objective criteria, offering a measure of LLM performance without subjective AI or human judgement. However, it only includes English instructions, limiting its ability to assess LLMs in other languages. We propose the Multilingual Instruction Following Evaluation (M-IFEval) benchmark, expanding the evaluation to French, Japanese, and Spanish, with both general and language-specific instructions. Applying this benchmark to 8 state-of-the-art LLMs, we find that benchmark performance across languages and instruction types can vary widely, underscoring the importance of a multilingual benchmark for evaluating LLMs in a diverse cultural context.
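To make "objective criteria" concrete, here is a minimal Python sketch of a programmatically verifiable instruction. The instruction ("use the letter X at least N times") and the names `letter_appears_at_least` and `pass_rate` are assumptions chosen for illustration; they are not taken from the IFEval or M-IFEval code.

```python
def letter_appears_at_least(response: str, letter: str, min_count: int) -> bool:
    """Check a hypothetical instruction: the letter must occur at least min_count times."""
    return response.lower().count(letter.lower()) >= min_count


def pass_rate(responses, check) -> float:
    """Fraction of responses that satisfy the check; no human or AI judge is involved."""
    if not responses:
        return 0.0
    return sum(1 for response in responses if check(response)) / len(responses)


responses = ["Banana bread", "Plum cake"]
rate = pass_rate(responses, lambda r: letter_appears_at_least(r, "a", 4))
```

A check like this returns the same verdict on every run, which is the property that distinguishes rule-based evaluation from judgement-based scoring.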

large language model, machine learning, natural language, (16 more...)

2502.04688

Country:

Asia > Thailand > Bangkok > Bangkok (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > Faroe Islands > Streymoy > Tórshavn (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

An Analysis of Letter Dynamics in the English Alphabet

Zhao, Neil, Zheng, Diana

arXiv.org Artificial Intelligence, Jan-27-2024

The tabulation of commonly used letters, as determined by letter frequency, was later utilized to improve typewriter keyboard arrangement by minimizing hand motion [5]. Statistical characteristics of different letters of the English alphabet were further studied in the context of different sentence structures [6]. The letters 'B', 'S', 'M', 'H', 'C' were found to occur most frequently as the initial letters of proper nouns, while 'E', 'A', 'R', 'N' were the most frequently used letters when the entire proper noun is considered. For entire text documents, the most commonly used letters were found to be 'E', 'T', 'A', 'O', 'N'. Interestingly, 95% of the English vocabulary was found to be represented by 13 letters of the alphabet. Our manuscript expands upon the statistical study of the English alphabet by evaluating letter frequency in the context of different categories of writing. We analyzed news articles, novels, plays, and scientific articles for letter frequency and distribution. As a result, we determined the information density of the letters of the alphabet. Additionally, we developed a metric called "distance, d" to act as a simple algorithm for recognizing writing category.
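The excerpt does not define the metric d, so the following Python sketch is only an illustration of the general idea of recognizing a writing category from letter frequencies. It assumes, purely for the example, that the distance is the sum of absolute differences between relative letter frequencies; the names `relative_frequencies`, `distance`, and `nearest_category` are hypothetical, and the reference texts are toy strings rather than real corpora.

```python
from collections import Counter
import string


def relative_frequencies(text):
    """Relative frequency of each of the 26 English letters, ignoring case."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters)
    return {ch: (counts[ch] / total if total else 0.0)
            for ch in string.ascii_lowercase}


def distance(freq_a, freq_b):
    """Assumed metric: sum of absolute frequency differences (not the paper's d)."""
    return sum(abs(freq_a[ch] - freq_b[ch]) for ch in string.ascii_lowercase)


def nearest_category(text, reference_profiles):
    """Pick the category whose reference profile is closest to the text."""
    freqs = relative_frequencies(text)
    return min(reference_profiles,
               key=lambda name: distance(freqs, reference_profiles[name]))


# Toy reference profiles; real ones would be built from large corpora.
profiles = {
    "category_a": relative_frequencies("aaaa bbbb"),
    "category_b": relative_frequencies("zzzz yyyy"),
}
guess = nearest_category("aabb", profiles)
```

Any distance between frequency distributions could be substituted in `distance` without changing the nearest-profile logic.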

category, frequency, letter frequency, (16 more...)

2401.1556

Country:

North America > United States > Oregon (0.04)
South America > Brazil (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(7 more...)

Genre: Research Report (0.64)

Industry:

Materials > Chemicals (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(2 more...)

Technology: Information Technology > Artificial Intelligence (0.46)

How to get the Letter Frequency in Python

#artificialintelligence, Oct-4-2020, 16:55:19 GMT

We provide a walk-through example of how to get the letter frequency in documents, considering either the whole document or only its unique words. We work with the book Moby Dick and report the frequency and the relative frequency of each letter. Finally, we compare our observed relative frequencies with the letter frequencies of the English language. The horizontal barplot in the original post shows that the letter e is the most common in both English texts and dictionaries. Note also that the distribution differs between texts and dictionaries.
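The two counting modes mentioned above (whole document versus unique words) can be sketched with the standard library alone. This is not the original post's code: the function names are hypothetical, and the short `sample` string is a stand-in for the full book text, which would normally be read from a file.

```python
from collections import Counter
import re


def letter_counts(text):
    """Count the letters a-z in the text, ignoring case and other characters."""
    return Counter(ch for ch in text.lower() if "a" <= ch <= "z")


def unique_words(text):
    """Return the set of distinct lowercase words made of the letters a-z."""
    return set(re.findall(r"[a-z]+", text.lower()))


def relative(counts):
    """Convert raw counts to relative frequencies, most common letter first."""
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.most_common()} if total else {}


# Stand-in text; replace with the contents of the book being analyzed.
sample = "the whale, the whale, and the sea"

document_freq = relative(letter_counts(sample))
dictionary_freq = relative(letter_counts(" ".join(unique_words(sample))))
```

Counting over unique words weights every word once, so letters concentrated in very common words (such as "the") carry less weight than they do in the whole-document count.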

artificial intelligence, frequency, letter frequency, (9 more...)

Technology: Information Technology > Artificial Intelligence (0.55)

Cryptology from the crypt: How I cracked a 70-year-old coded message from beyond the grave

#artificialintelligence, Aug-30-2019, 22:21:58 GMT

In recent weeks I managed to decrypt a difficult cipher that, despite expert codebreakers' best efforts, had remained unsolved for 70 years. The code was created by the late Cambridge professor and scientist Robert Henry Thouless, who passed away in 1984. He created it as a "test of survival" to see if he could communicate with the living after his death. Thouless thought if he successfully transmitted cipher keywords to the living through spiritual mediums and the message was received, this would prove he had survived his death. In 2019, I was more interested in seeing whether computer speed, storage and networking capabilities had advanced enough to break a code that had outlived its maker.

artificial intelligence, cipher, thouless, (15 more...)

Country: Europe > United Kingdom (0.31)

Industry:

Information Technology (0.31)
Government (0.31)

Technology: Information Technology > Artificial Intelligence (0.50)
